53 research outputs found
Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis
We consider the task of fine-grained sentiment analysis from the perspective
of multiple instance learning (MIL). Our neural model is trained on document
sentiment labels, and learns to predict the sentiment of text segments, i.e.
sentences or elementary discourse units (EDUs), without segment-level
supervision. We introduce an attention-based polarity scoring method for
identifying positive and negative text snippets and a new dataset which we call
SPOT (as shorthand for Segment-level POlariTy annotations) for evaluating
MIL-style sentiment models like ours. Experimental results demonstrate superior
performance against multiple baselines, whereas a judgement elicitation study
shows that EDU-level opinion extraction produces more informative summaries
than sentence-based alternatives.

Comment: Final published version. Please cite using appropriate date (2018).
Link to journal:
http://www.transacl.org/ojs/index.php/tacl/article/view/1225/27
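The attention-based polarity scoring described above can be sketched in a few lines: each segment receives a polarity in [-1, 1] and an attention weight, and the document score is their weighted sum, so supervision on document labels alone still yields segment-level polarities. This is a minimal sketch, not the paper's exact parameterisation; the weight vectors `W_p` and `W_a` are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def doc_polarity(segment_vecs, W_p, W_a):
    """MIL-style sentiment scorer (sketch, hypothetical weights W_p, W_a).

    segment_vecs: (n, d) encodings of n segments (sentences or EDUs).
    Returns the document score plus per-segment polarities and attention,
    which is what lets a document-supervised model expose segment opinions.
    """
    polarity = np.tanh(segment_vecs @ W_p)   # (n,) segment polarity in [-1, 1]
    logits = segment_vecs @ W_a              # (n,) attention logits
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()                       # softmax over segments
    return float(attn @ polarity), polarity, attn
```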
Learning Structured Text Representations
In this paper, we focus on learning structure-aware document representations
from data without recourse to a discourse parser or additional annotations.
Drawing inspiration from recent efforts to empower neural networks with a
structural bias, we propose a model that can encode a document while
automatically inducing rich structural dependencies. Specifically, we embed a
differentiable non-projective parsing algorithm into a neural model and use
attention mechanisms to incorporate the structural biases. Experimental
evaluation across different tasks and datasets shows that the proposed model
achieves state-of-the-art results on document modeling tasks while inducing
intermediate structures which are both interpretable and meaningful.

Comment: change to one-based indexing; published in Transactions of the
Association for Computational Linguistics (TACL),
https://transacl.org/ojs/index.php/tacl/article/view/1185/28
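The differentiable non-projective parsing step can be realised with the matrix-tree theorem, which turns arc scores into marginal probabilities of dependency arcs; those marginals can then serve as structural attention weights. A minimal numpy sketch under that assumption (the arc-weight matrix `w` is a hypothetical input, not the paper's exact parameterisation):

```python
import numpy as np

def tree_marginals(w):
    """Arc marginals over non-projective spanning trees (matrix-tree theorem).

    w: (n+1, n+1) non-negative arc weights, w[h, m] scoring head h -> modifier m,
    with node 0 an artificial root. Returns mu with mu[h, m] = P(arc h -> m)
    under the distribution proportional to the product of a tree's arc weights.
    """
    n = w.shape[0] - 1
    # Directed-graph Laplacian over the non-root nodes; det(L) is the partition function.
    L = np.zeros((n, n))
    for m in range(1, n + 1):
        L[m - 1, m - 1] = sum(w[h, m] for h in range(n + 1) if h != m)
        for h in range(1, n + 1):
            if h != m:
                L[h - 1, m - 1] = -w[h, m]
    Linv = np.linalg.inv(L)
    # Marginals are weights times gradients of log det(L).
    mu = np.zeros_like(w)
    for m in range(1, n + 1):
        mu[0, m] = w[0, m] * Linv[m - 1, m - 1]
        for h in range(1, n + 1):
            if h != m:
                mu[h, m] = w[h, m] * (Linv[m - 1, m - 1] - Linv[m - 1, h - 1])
    return mu
```

Because every step is differentiable (a determinant/inverse of the Laplacian), the marginals can be used inside a neural model and trained end-to-end, which is the structural bias the abstract refers to.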
The Automatic Interpretation of Nominalizations
This paper discusses the interpretation of nominalizations in domain-independent wide-coverage text. We present a statistical model which interprets nominalizations based on the co-occurrence of verb-argument tuples in a large balanced corpus. We propose an algorithm which treats the interpretation task as a disambiguation problem and achieves a performance of approximately 80% by combining partial parsing, smoothing techniques and domain-independent taxonomic information (e.g., WordNet).
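The disambiguation view of nominalization interpretation can be sketched as picking the verb-argument relation with the highest smoothed corpus frequency, e.g. deciding whether "car lover" means love(obj=car) or love(subj=car). The toy counts and Laplace smoothing below are illustrative stand-ins for the paper's corpus statistics and smoothing techniques:

```python
def interpret(verb, modifier, counts, relations=("subj", "obj"), alpha=0.5):
    """Choose the relation r maximising smoothed P(r | verb, modifier).

    counts: dict mapping (verb, argument, relation) -> corpus frequency.
    alpha: Laplace smoothing constant (stand-in for real smoothing schemes).
    Returns the best relation and the full probability distribution.
    """
    scores = {r: counts.get((verb, modifier, r), 0) + alpha for r in relations}
    total = sum(scores.values())
    probs = {r: s / total for r, s in scores.items()}
    return max(probs, key=probs.get), probs
```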
Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations
This paper examines the extent to which verb diathesis alternations are empirically attested in corpus data. We automatically acquire alternating verbs from large balanced corpora by using partial-parsing methods and taxonomic information, and discuss how corpus data can be used to quantify linguistic generalizations. We estimate the productivity of an alternation and the typicality of its members using type and token frequencies.
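Type and token frequencies of the kind the abstract mentions come straight from corpus counts. The sketch below is a hypothetical helper, not the paper's exact measures: it treats the number of distinct alternating verbs as a productivity estimate and each verb's share of the alternation's tokens as its typicality:

```python
from collections import Counter

def alternation_stats(corpus_verbs, alternating):
    """Type/token statistics for a diathesis alternation (sketch).

    corpus_verbs: iterable of verb tokens from a corpus.
    alternating: set of verbs attested in both frames of the alternation.
    Returns (type frequency, per-verb typicality from token shares).
    """
    tokens = Counter(v for v in corpus_verbs if v in alternating)
    type_freq = len(tokens)                   # distinct alternating verbs
    token_total = sum(tokens.values())
    typicality = {v: c / token_total for v, c in tokens.items()}
    return type_freq, typicality
```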
A Corpus-based Account of Regular Polysemy: The Case of Context-sensitive Adjectives
In this paper we investigate polysemous adjectives whose meaning varies depending on the nouns they modify (e.g., fast). We acquire the meanings of these adjectives from a large corpus and propose a probabilistic model which provides a ranking on the set of possible interpretations. We identify lexical semantic information automatically by exploiting the consistent correspondences between surface syntactic cues and lexical meaning. We evaluate our results against paraphrase judgments elicited experimentally from humans and show that the model's ranking of meanings correlates reliably with human intuitions: meanings that are found highly probable by the model are also rated as plausible by the subjects.
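The probabilistic ranking of interpretations can be sketched as ordering candidate verbal paraphrases of an adjective-noun pair (e.g. "fast plane" as a plane that flies vs. lands quickly) by a simple conditional probability estimated from co-occurrence counts. The toy counts below are illustrative, not the paper's model:

```python
def rank_meanings(noun, verb_counts):
    """Rank candidate verbal paraphrases of an adjective-noun pair (sketch).

    verb_counts: dict noun -> {candidate verb: co-occurrence count}.
    Returns (verb, P(verb | noun)) pairs, most probable interpretation first.
    """
    candidates = verb_counts.get(noun, {})
    total = sum(candidates.values()) or 1     # avoid division by zero
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return [(v, c / total) for v, c in ranked]
```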
Semantic Graph Parsing with Recurrent Neural Network DAG Grammars
Semantic parses are directed acyclic graphs (DAGs), so semantic parsing
should be modeled as graph prediction. But predicting graphs presents difficult
technical challenges, so it is simpler and more common to predict the
linearized graphs found in semantic parsing datasets using well-understood
sequence models. The cost of this simplicity is that the predicted strings may
not be well-formed graphs. We present recurrent neural network DAG grammars, a
graph-aware sequence model that ensures only well-formed graphs while
sidestepping many difficulties in graph prediction. We test our model on the
Parallel Meaning Bank---a multilingual semantic graphbank. Our approach yields
competitive results in English and establishes the first results for German,
Italian and Dutch.

Comment: 9 pages, to appear in EMNLP 2019
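The core idea, constraining a sequence model so that every output string decodes to a well-formed graph, can be sketched with an action mask. In the toy decoder below (a sketch of the idea, not the paper's RNN DAG grammar; random choices stand in for a trained model's scores), only validity-preserving actions are ever offered, and edges always point to an already-emitted node, ruling out cycles by construction:

```python
import random

def generate_dag(max_nodes=5, seed=0):
    """Grammar-constrained toy decoder: every sampled sequence is a valid DAG.

    Actions: NODE emits a new node, (EDGE, t) attaches the newest node to an
    earlier node t, EOS stops. The mask below is what guarantees well-formedness.
    """
    rng = random.Random(seed)
    nodes, edges, seq = 0, [], []
    for _ in range(50):                        # step cap keeps the sketch finite
        actions = []
        if nodes < max_nodes:
            actions.append(("NODE", None))
        if nodes >= 2:                         # an edge needs an earlier target
            actions.extend(("EDGE", t) for t in range(nodes - 1))
        if nodes >= 1:                         # may only stop with a non-empty graph
            actions.append(("EOS", None))
        act, arg = rng.choice(actions)         # a trained model would score these
        seq.append((act, arg))
        if act == "NODE":
            nodes += 1
        elif act == "EDGE":
            edges.append((nodes - 1, arg))     # source is the newest node
        else:
            break
    return nodes, edges, seq
```

Because edges can only target earlier nodes, acyclicity holds for any action sequence the mask permits, which is the sense in which the model "ensures only well-formed graphs".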
- …